Cultural heritage guide system: a combination of augmented reality, deep learning and culture technology
Maryam Shakeri, Abolghasem Sadeghi-Niaraki
Geoinformation Technology Center of Excellence, Faculty of Geodesy & Geomatics Eng., K.N. Toosi University of Technology
Abstract
Cultural heritage sites are places to learn about the culture and history of a country, yet few creative means are currently used to attract more visitors to them. Augmented reality (AR) technology can serve as an attractive way to facilitate access to cultural and historical information. One important challenge in AR is object recognition, which is needed to augment real objects accurately with the relevant information. This paper aims to design a guide system based on Culture Technology that uses AR and a deep learning method to recognize cultural objects. To this end, first the architecture of the system and a convolutional neural network architecture for object recognition based on MobileNet are designed. Then the proposed system is implemented for the cultural objects of the SaadAbad Palace in Tehran, Iran. Finally, the method is evaluated using a confusion matrix.
Keywords: Augmented Reality; Deep Learning; Object Recognition; Cultural Heritage; Culture Technology
1. Introduction
Iranian cultural heritage sites such as SaadAbad Palace represent the rich and ancient culture of Iran. However, the way these sites are currently visited keeps visitor numbers low, owing to a lack of creativity and suitable tools. Usually, tour guides explain the cultural objects to groups of visitors, or visitors read the label of each object. These approaches have several problems: there are too few tour guides, so some places have none; object labels are small and inconspicuous, so visitors cannot read them; and the same explanation is given to everyone, so people cannot get the information they want. As a result, visitors may miss much of the cultural and historical information. For example, in SaadAbad Palace a visitor may pass the statue of Arash the Archer without learning its history. Culture Technology (CT) can address this by combining culture and technology to provide personalized information and services to the visitor.
Augmented Reality (AR), as one of the technologies of CT, has an important role in propagating culture in a society. CT is the study of the interaction between culture and digital technology, and it comprises different technologies, including AR. AR is a ubiquitous and interactive interface that adds computer-generated information to objects and positions in the real-world environment [1]. AR can create new value at cultural heritage sites, allowing people to gain rich experience and knowledge of various cultural objects in an attractive and simple way [2]. Recently, AR has been used for cultural heritage either as a tool for supporting scientific work or as a guide system providing personalized information to users.
As an essential area of AR research, object recognition and tracking must be done accurately to combine the real world with the relevant information. In general, there are two types of AR: marker-based and marker-less. Marker-less AR, which recognizes objects without markers, is preferred because it allows more varied interaction and works with objects that have no pre-assigned marker [3]. Different methods have been used for object recognition and tracking, but recent advances in hardware and algorithms have sparked interest in deep learning algorithms. Deep learning methods, which attempt to emulate the functionality of the human brain artificially via hardware and software, have been used successfully in complex recognition tasks such as object classification [4].
Recently, several works have used deep learning methods for object recognition and tracking in the AR domain. In intelligent transportation systems, a fast deep CNN method was proposed for obstacle recognition in an AR-based driver information system [5]. A lightweight CNN object detection method was proposed to develop marker-less outdoor mobile AR; its results were combined with the objects' corresponding spatial relationships to achieve precise registration [6]. The Faster R-CNN method was used for real-time object recognition to develop marker-less AR for 3D integral imaging [3]. The DeepAR method was introduced based on AlexNet, a well-known CNN architecture, and HIPS, an efficient matching algorithm, to develop marker-based AR [4].
This paper aims to design a cultural heritage guide system based on AR and deep learning for recognizing and classifying cultural objects. The system enables visitors to search and browse large collections of general and cultural heritage information repositories with minimal interaction. In this regard, MobileNet, which is built on depthwise separable convolutions and is well suited to mobile applications, is used for object recognition. Furthermore, a prototype implementation intended to improve the tourism experience is provided, using data gathered from cultural objects of SaadAbad Palace and built with the Keras library in Python.
Section 2 introduces the architecture and structure of the proposed system for cultural heritage. Section 3 provides experimental results obtained for cultural heritage objects of SaadAbad Palace. Section 4 concludes the paper.
2. Cultural Heritage Guide System
The cultural heritage guide system enhances the visitor's experience by combining AR technology with the cultural and historical content of cultural objects. It provides an AR-based service that displays suitable information about cultural heritage to visitors, answering questions such as “What is the visitor seeing?”, “How did artists look at this location?”, “What is the history?”, “What kind of stories are related?”, “Which events have taken place?”, “Which persons were involved in this place?” and “What is my next stop?”. This information is overlaid on real cultural heritage objects through the camera of the visitor's smartphone. In other words, when visitors at a cultural heritage site point their smartphone at a cultural object, the object is recognized and suitable information about it is displayed on the camera view.
Fig. 1 shows the client-server architecture of the system. The system has two main parts: the Android side and the server side. The Android side is an AR application that recognizes cultural objects and displays information according to the recognition result. The server side consists of databases that store information about cultural heritage objects. The application starts by opening the smartphone camera and takes each camera frame as an image. The image is then passed as input to the object recognition method, which the application runs using the trained deep learning model. After the cultural object is recognized, its ID is sent to the server to retrieve the right information. The server retrieves the information based on the object's ID and sends the result back to the user.
Fig. 1 Cultural heritage guide system
Fig. 2 shows the algorithm flow of the object recognition method based on the client-server architecture explained above. It starts with the AR application and ends with information displayed on the camera frame of the smartphone. Only the object ID and the retrieved information are transferred between the client and the server.
For object recognition and classification, a deep convolutional neural network detector based on the MobileNet CNN architecture is introduced. MobileNet is an efficient network architecture for building very small, low-latency models that are suitable for mobile and embedded vision applications. It is based on depthwise separable convolutions [7], and the network used here has 3,233,989 parameters in total. Fig. 3 shows the MobileNet architecture.
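The flow described above can be illustrated with a minimal Python sketch. It is not the authors' Android implementation: the record schema, the in-memory dictionary standing in for the server database, the confidence threshold and all function names are assumptions made for illustration, and the classifier is passed in as a plain function that returns class scores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class HeritageInfo:
    """Record returned by the server for one object (hypothetical schema)."""
    title: str
    history: str


# Stand-in for the server-side database, keyed by object ID.
# IDs and texts are illustrative placeholders, not data from the paper.
SERVER_DB: Dict[int, HeritageInfo] = {
    0: HeritageInfo("Statue of Arash the Archer", "Placeholder history text."),
    1: HeritageInfo("Object C2", "Placeholder history text."),
}


def server_lookup(object_id: int) -> Optional[HeritageInfo]:
    """Server side: retrieve the stored information for an object ID."""
    return SERVER_DB.get(object_id)


def recognize(scores: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Client side: map class scores to an object ID, or None if unsure.

    The threshold is an assumption; the paper does not mention one.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None


def process_frame(
    frame: object,
    classifier: Callable[[object], Sequence[float]],
) -> Optional[HeritageInfo]:
    """One pass of the loop: frame -> object ID -> server lookup -> overlay data."""
    object_id = recognize(classifier(frame))
    if object_id is None:
        return None  # nothing recognized, so nothing to augment
    # Only the ID is sent to the server and only the information comes back.
    return server_lookup(object_id)
```

Keeping recognition on the client means the camera frame never leaves the phone; the request to the server carries a single integer.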
Fig. 2 Flowchart of the client-server model based object recognition
Fig. 3 MobileNet Architecture
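The reported total of 3,233,989 parameters can be checked by hand. The paper does not describe the classifier head, so the sketch below rests on assumptions: the standard MobileNet body from [7] at width multiplier 1.0 (a 3x3 stem convolution followed by 13 depthwise separable blocks, no convolution biases, four values per channel for each batch normalization layer), followed by global pooling and a single dense layer over the five classes. Under those assumptions the count matches the reported figure.

```python
from typing import List, Tuple


def standard_conv_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Parameters of a k x k convolution (no bias) plus its batch norm."""
    return kernel * kernel * c_in * c_out + 4 * c_out


def separable_block_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Parameters of one depthwise separable block (no biases)."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    batch_norm = 4 * c_in + 4 * c_out    # gamma, beta, moving mean, moving variance
    return depthwise + pointwise + batch_norm


# (input channels, output channels) of the 13 blocks at width multiplier 1.0.
BLOCKS: List[Tuple[int, int]] = (
    [(32, 64), (64, 128), (128, 128), (128, 256), (256, 256), (256, 512)]
    + [(512, 512)] * 5
    + [(512, 1024), (1024, 1024)]
)


def mobilenet_params(num_classes: int, in_channels: int = 3) -> int:
    """Total parameters: stem + separable blocks + assumed dense head."""
    stem = standard_conv_params(in_channels, 32)
    body = sum(separable_block_params(c_in, c_out) for c_in, c_out in BLOCKS)
    head = 1024 * num_classes + num_classes  # dense weights + biases (assumed head)
    return stem + body + head


print(mobilenet_params(5))  # 3233989
```

The saving from the separable factorization is visible per block: a 512-to-512 block costs `separable_block_params(512, 512)` parameters, against `standard_conv_params(512, 512)` for an ordinary 3x3 convolution of the same width.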
3. Experimental Results
SaadAbad Palace was selected as a case study to implement and test the method. The cultural and historical complex of SaadAbad covers an area of 110 hectares and was built by the Qajar and Pahlavi monarchs. Today it is used as a museum that houses various cultural objects, each of which reflects Iran's history and culture, for example the statue of Arash. Arash the Archer, whose name remains popular among Iranians, is a heroic Iranian archer who sacrificed his life to preserve the territorial integrity of Iran in the war between the Iranians and the Turanians. In this paper, five cultural objects were selected (Fig. 4).
To implement the model, images were gathered of the five cultural objects (Fig. 4). For each object, about 110 images were captured from different viewpoints. Of these, 80 images per object were used to train the model, and 20 and 10 images per object were used for validation and testing, respectively. Sample images are given in Fig. 5 to give the reader an idea of the variation in viewpoint and imaging conditions.
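The per-object split can be sketched as follows. The paper does not say how images were assigned to each subset, so the random shuffle, the fixed seed, the function name and the file names are all assumptions for illustration.

```python
import random
from typing import Dict, List, Tuple


def split_per_object(
    images: Dict[str, List[str]],
    n_train: int = 80,
    n_val: int = 20,
    n_test: int = 10,
    seed: int = 0,
) -> Dict[str, List[Tuple[str, str]]]:
    """Split each object's images into train/val/test lists of (path, label)."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    needed = n_train + n_val + n_test
    splits: Dict[str, List[Tuple[str, str]]] = {"train": [], "val": [], "test": []}
    for label in sorted(images):
        paths = list(images[label])
        if len(paths) < needed:
            raise ValueError(f"{label}: need {needed} images, got {len(paths)}")
        rng.shuffle(paths)  # assumed random assignment
        bounds = {
            "train": (0, n_train),
            "val": (n_train, n_train + n_val),
            "test": (n_train + n_val, needed),
        }
        for name, (start, stop) in bounds.items():
            splits[name].extend((path, label) for path in paths[start:stop])
    return splits


# Hypothetical file names: five objects (C1..C5) with 110 images each.
images = {f"C{k}": [f"C{k}/img_{i:03d}.jpg" for i in range(110)] for k in range(1, 6)}
splits = split_per_object(images)
print({name: len(items) for name, items in splits.items()})
# {'train': 400, 'val': 100, 'test': 50}
```

Splitting per object rather than over the pooled set keeps the five classes balanced in every subset, which matters when only 10 test images per class are available.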
Fig. 4 Cultural objects of SaadAbad Palace
Fig. 5 Images of cultural objects from different viewpoints and under different illumination conditions
For this research, we used a laptop with an Intel® Core i5-3210M processor at about 2.5 GHz, a 64-bit operating system, and 8.0 GB of memory. The program was written in Python using the TensorFlow platform and the Keras library. The model was trained for 100 epochs, and the test was run for 5 epochs.
The classification results are shown as a confusion matrix in Table 1. The overall correct classification accuracy (the average of the diagonal elements of the confusion matrix) is 0.92; the only off-diagonal entries involve the pairs C1/C3 and C4/C5.
Table 1. Confusion matrix for the five cultural objects (C1 to C5)
|    | C1  | C2   | C3  | C4   | C5  |
|----|-----|------|-----|------|-----|
| C1 | 80% | 0    | 20% | 0    | 0   |
| C2 | 0   | 100% | 0   | 0    | 0   |
| C3 | 10% | 0    | 90% | 0    | 0   |
| C4 | 0   | 0    | 0   | 100% | 0   |
| C5 | 0   | 0    | 0   | 10%  | 90% |
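The overall accuracy can be recomputed from the entries of Table 1. The sketch below assumes each row is normalized to 100% over the test images of one object; with the same number of test images per object, the mean of the diagonal then equals the overall accuracy. The function names are illustrative.

```python
from typing import List, Sequence, Tuple

LABELS = ["C1", "C2", "C3", "C4", "C5"]

# Entries of Table 1 in percent, one row per object.
CONFUSION = [
    [80, 0, 20, 0, 0],
    [0, 100, 0, 0, 0],
    [10, 0, 90, 0, 0],
    [0, 0, 0, 100, 0],
    [0, 0, 0, 10, 90],
]


def mean_diagonal(matrix: Sequence[Sequence[float]]) -> float:
    """Average of the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


def off_diagonal(
    matrix: Sequence[Sequence[float]], labels: Sequence[str]
) -> List[Tuple[str, str, float]]:
    """List every non-zero confusion as (row label, column label, value)."""
    return [
        (labels[i], labels[j], value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if i != j and value > 0
    ]


print(mean_diagonal(CONFUSION) / 100)  # 0.92
print(off_diagonal(CONFUSION, LABELS))
# [('C1', 'C3', 20), ('C3', 'C1', 10), ('C5', 'C4', 10)]
```

With 10 test images per object, each 10% step in the table corresponds to a single image, so the four misclassified images account for the whole gap between 0.92 and 1.0.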
4. Conclusion
Using CT, the culture of visiting cultural heritage sites can be changed to create a more attractive environment that draws more visitors. In this regard, the cultural heritage guide system was designed to provide personalized information about cultural objects. The system facilitates real-time access to cultural and historical information such as texts, images and films when the user points a smartphone at a cultural object. This makes it possible for the user to concentrate more on the cultural content than on the way the information is presented.
For object recognition in the system, a convolutional network based on the MobileNet model was designed and implemented for cultural objects of SaadAbad Palace. The overall correct classification accuracy obtained was 0.92. This high value may be due to the small number and limited variety of cultural objects and the small amount of test data.
Acknowledgment
We sincerely thank all the people who contributed to this work. This research was done as part of the international Culture Technology dual degree program between K. N. Toosi University of Technology and KAIST.
References
[1] Schmalstieg, D. and G. Reitmayr, The world as a user interface: Augmented reality for ubiquitous computing, in Location based services and telecartography. 2007, Springer. p. 369-391.
[2] Tscheu, F. and D. Buhalis, Augmented reality at cultural heritage sites, in Information and Communication Technologies in Tourism 2016. 2016, Springer. p. 607-619.
[3] Sutanto, R.E., L. Pribadi, and S. Lee. 3D integral imaging based augmented reality with deep learning implemented by Faster R-CNN. in International Conference on Mobile and Wireless Technology. 2017. Springer.
[4] Akgul, O., H.I. Penekli, and Y. Genc. Applying deep learning in augmented reality tracking. in Signal-Image Technology & Internet-Based Systems (SITIS), 2016 12th International Conference on. 2016. IEEE.
[5] Abdi, L. and A. Meddeb, Driver information system: a combination of augmented reality, deep learning and vehicular Ad-hoc networks. Multimedia Tools and Applications, 2017: p. 1-31.
[6] Rao, J., et al., A Mobile Outdoor Augmented Reality Method Combining Deep Learning Object Detection and Spatial Relationships for Geovisualization. Sensors, 2017. 17(9): p. 1951.
[7] Howard, A.G., et al., MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.